Outlier Detection by Forecasting

Authors

  • Brendan Livingston
  • Nathan McDermott
Abstract

The Consumer Expenditure Quarterly Interview Survey collects data from consumer units (CUs) about their expenses during the previous 3 months. The purpose of the survey is to gather information about large purchases, such as those of vehicles and appliances, and expenditures that are made on a regular basis, such as rent and utility payments. These data are collected by the U.S. Census Bureau and then transferred to the U.S. Bureau of Labor Statistics (BLS), Division of Consumer Expenditure Surveys (CE). The branch of Production and Control (P&C) screens and processes the raw data for their eventual use in publications and in the weighting of the BLS Consumer Price Index. P&C’s final data-editing procedure for the Interview Survey is the Monthly Tabulation of Expenditures (MTAB), which maps or assigns expenditures to a specific month and a Universal Classification Code (UCC). The MTAB Review procedure then evaluates the created data for suspicious values. To improve the existing review procedure, P&C initiated a research project in August 2005. The goals of this project were to make the MTAB Review more efficient, focus analysts’ attention on outliers, create more informative reports, and provide more accurate data to end users. Three techniques for improving the process of selecting outliers were investigated during the modernization of the MTAB Review. The method that was chosen, which compared forecasted with reported values, was implemented in February 2006. With this technique, the analyst detects outliers by using forecasted prediction intervals created by SAS and comparing them with current means. This article summarizes the forecasting technique adopted for the MTAB Review.
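
The comparison of current means against forecast prediction intervals can be illustrated with a minimal sketch. The abstract says only that the intervals are created by SAS and does not name the forecasting model, so this Python example swaps in a deliberately simple stand-in: the forecast is the historical mean, and the interval uses a normal approximation. The function names, the UCC labels, and the data are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def prediction_interval(history, level=0.95):
    """Return (forecast, lower, upper) for the next value of a series.

    Assumption (not from the article): the forecast is the historical
    mean and forecast errors are roughly normal. A production system
    would use a proper time series model instead.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two historical values")
    center = mean(history)
    # Spread for predicting one new observation around the sample mean.
    spread = stdev(history) * sqrt(1 + 1 / n)
    # Two-sided normal quantile for the requested coverage level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return center, center - z * spread, center + z * spread


def flag_outliers(history_by_ucc, current_means, level=0.95):
    """Flag each code whose current mean falls outside its interval."""
    flagged = {}
    for ucc, current in current_means.items():
        forecast, lower, upper = prediction_interval(history_by_ucc[ucc], level)
        if not lower <= current <= upper:
            flagged[ucc] = (current, forecast, lower, upper)
    return flagged


if __name__ == "__main__":
    # Hypothetical monthly mean expenditures for two made-up codes.
    history = {
        "UCC_A": [100, 102, 98, 101, 99, 100],
        "UCC_B": [50, 55, 45, 52, 48, 50],
    }
    current = {"UCC_A": 140.0, "UCC_B": 51.0}
    for ucc, (value, forecast, lower, upper) in flag_outliers(history, current).items():
        print(f"{ucc}: current mean {value:.1f} is outside "
              f"[{lower:.1f}, {upper:.1f}] (forecast {forecast:.1f})")
```

In this sketch only the flagged codes are reported, which mirrors the stated goal of focusing analysts' attention on outliers rather than on every tabulated value.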

Similar resources

A practical approach to forecast Quality of Service parameters considering outliers

Autoregressive integrated moving average (ARIMA) models are used in various studies for modelling and forecasting traffic and Quality of Service (QoS) parameter values in telecommunication networks to make reasonable short-, medium-, and long-term predictions. We propose a methodology for using ARIMA models for QoS prediction in network scenarios based on a preliminary detection and elimination ...

An Approach Based on Multi-feature Wavelet and Elm Algorithm for Forecasting Outlier Occurrence in Chinese Stock Market

The prediction of outliers plays an important role in stock arbitrage and risk avoidance. While most research has focused on detecting outliers and removing them to forecast time series data, few studies have focused on forecasting the occurrence of outliers. The main goal of this work is to forecast outlier occurrence in the Chinese stock market. First, we detect abnormal points of two market indexes and six...

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates, and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

Control chart based on residues: Is a good methodology to detect outliers?

The purpose of this article is to evaluate the application of forecasting models along with the use of residual control charts to assess production processes whose samples have autocorrelation characteristics. The main objective is to determine the efficiency of control charts for individual observations (CCIO) and exponentially weighted moving average (EWMA) charts when they are applied to res...

Outlier Detection by Boosting Regression Trees

A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...

Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis

Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...

Publication date: 2008